An openedx_catalog app with a representation for CourseRuns [FC-0117]#479

Open
bradenmacdonald wants to merge 29 commits into main from braden/catalog

Conversation

@bradenmacdonald
Contributor

@bradenmacdonald bradenmacdonald commented Feb 10, 2026

An implementation of #469.

Related platform PR: openedx/openedx-platform#38023

Notes

  • This adds a new dependency on edx-organizations (which also requires Pillow for the logo field)
  • This uses opaque-keys, which was already an indirect dependency but not used directly
  • Before this PR, only course runs have a display_name, and catalog courses do not have names nor exist in the core platform at all. I'm proposing we add a display_name field to the new core CatalogCourse model to support various use cases, including the proposed new studio home page. See the code for how this can be backfilled and how runs can always override the name for each run.
  • This PR does not really include a CRUD API (neither python nor REST) for manipulating CatalogCourse/CourseRun objects; just a minimal API that platform code can use to keep them in sync with CourseOverview.
  • However, this PR does include complete Django admin views that admins and developers can use for provisioning course runs, deleting course runs, testing data migrations, etc.
    • Screenshot 2026-02-21 at 4 43 26 PM
    • Screenshot 2026-02-21 at 4 44 19 PM
  • This currently does NOT allow course runs to have a different course code or org code from other runs of the same catalog course. (This restriction could be loosened in the future.) However, the org code capitalization may vary among runs.
  • Currently, it is the course_overviews app that syncs data from modulestore -> CourseOverview -> openedx_catalog. It would be more robust and future-friendly to instead sync data directly from SplitModulestoreCourseIndex -> openedx_catalog, but it's harder to get information like display_name and language in that case, as it's not available in the SplitModulestoreCourseIndex table. It could be retrieved from Mongo, though.
  • This PR bakes into the data model the assumption that all runs of the same catalog course are in the same language. Discussion on Slack
  • Because these models are updated by CourseOverview based on the course_published signal, any test cases in platform that want to use these models have to be sure to enable that signal, which is disabled for tests by default.
  • ⚠️ Because we rely on signals to update CourseRun when CourseOverview is updated, and CourseOverview itself is updated after courses are already created, it's possible that an error will occur and the CourseRun will never get created or updated. This will result in an error in the logs, but will not block course creation etc., so the error is likely to go unnoticed with the system as it exists today.
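
The failure mode in the last note can be illustrated with a small, framework-free sketch. The names `sync_course_run` and `on_course_overview_updated` are hypothetical and are not taken from this PR; the sketch only assumes that the receiver catches exceptions and logs them so that course creation is not interrupted.

```python
import logging

log = logging.getLogger(__name__)


def sync_course_run(overview: dict, catalog: dict) -> None:
    """Hypothetical sync step: copy fields from a CourseOverview-like dict into the catalog."""
    if not overview.get("display_name"):
        raise ValueError("display_name must not be blank")
    catalog[overview["course_key"]] = {"display_name": overview["display_name"]}


def on_course_overview_updated(overview: dict, catalog: dict) -> bool:
    """
    Hypothetical signal receiver. Errors are logged rather than re-raised, so the
    caller (course creation/publishing) carries on and the catalog row is simply missing.
    """
    try:
        sync_course_run(overview, catalog)
    except Exception:
        log.exception("Failed to sync CourseRun for %s", overview.get("course_key"))
        return False
    return True


catalog: dict = {}
ok = on_course_overview_updated(
    {"course_key": "course-v1:org+code+run", "display_name": "Demo"}, catalog
)
failed = on_course_overview_updated(
    {"course_key": "course-v1:org+code+run2", "display_name": ""}, catalog
)
```

In this sketch the second call leaves no entry in `catalog` and only produces a log record, which is the "likely to go unnoticed" situation described above.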

Architecture Diagram

See ARCHITECTURE.md.

Questions

  • Are org_code and course_code good terms to use? Should I call the latter number instead, like other parts of the code do? Should I call the org part org_short_name?
  • Is the new url_code for each CatalogCourse useful? Do we want to make it editable now, or in the future?
  • Should we mark [parts of] this new catalog API as unstable?

Can be addressed later:

  • Should we make the new CourseRun a SoftDeletableModel to support course deletion without data loss? (Maybe we can't now, because soft-deleting it in that one table wouldn't affect the other tables that the system actually references. But in the future we could add this.)
  • Is "Course Schedule" a core concept that should live in this catalog app? I think yes, but it could also have a draft/publish workflow and ultimately live in openedx_content
  • Is "Course Visibility" a core concept that should be in this catalog app? (We actually have it defined as catalog_visibility, visible_to_staff_only, and course_visibility which all have different effects and different enum values, and also need to support "use system default")
  • Integrate authz?
  • Check if we should deprecate edx-organization's OrganizationCourse
  • test what happens to enrollments if course is deleted then re-created with different capitalization (not even using this PR)

@openedx-webhooks openedx-webhooks added the open-source-contribution (PR author is not from Axim or 2U) and core contributor (PR author is a Core Contributor, who may or may not have write access to this repo) labels Feb 10, 2026
@openedx-webhooks

openedx-webhooks commented Feb 10, 2026

Thanks for the pull request, @bradenmacdonald!

This repository is currently maintained by @axim-engineering.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Details
Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@github-project-automation github-project-automation bot moved this to Needs Triage in Contributions Feb 10, 2026
@mphilbrick211 mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Feb 10, 2026
@bradenmacdonald bradenmacdonald added the FC (Relates to an Axim Funded Contribution project) label Feb 10, 2026
@bradenmacdonald bradenmacdonald changed the title An openedx_catalog app with a representation for CourseRuns An openedx_catalog app with a representation for CourseRuns [FC-0117] Feb 10, 2026
@ormsbee
Contributor

ormsbee commented Feb 11, 2026

There is no stable/portable identifier for CatalogCourse objects, only their internal integer database ID. Do we need this? (An opaque key or something else?)

I don't know if we'll force it to be stable, but we'll probably want a SlugField(allow_unicode=True) so that we can address it in URLs in a descriptive looking way.

@bradenmacdonald bradenmacdonald changed the base branch from main to kdmccormick/root-packages February 13, 2026 19:51
Base automatically changed from kdmccormick/root-packages to main February 13, 2026 20:00
@bradenmacdonald bradenmacdonald force-pushed the braden/catalog branch 5 times, most recently from 87f845c to ccebc9d Compare February 19, 2026 01:27
@bradenmacdonald bradenmacdonald force-pushed the braden/catalog branch 3 times, most recently from 88e2f82 to bf0fbd6 Compare February 22, 2026 03:28
@bradenmacdonald bradenmacdonald marked this pull request as ready for review February 24, 2026 07:26
@kdmccormick kdmccormick self-requested a review February 24, 2026 16:23
@bradenmacdonald
Contributor Author

There is no stable/portable identifier for CatalogCourse objects, only their internal integer database ID. Do we need this? (An opaque key or something else?)

I don't know if we'll force it to be stable, but we'll probably want a SlugField(allow_unicode=True) so that we can address it in URLs in a descriptive looking way.

Historical note: Apparently there was an old PR to add an opaque key for catalog courses, but it never merged: openedx/opaque-keys#87 Seems like they were called "aggregate courses" then.

Member

@kdmccormick kdmccormick left a comment

Partial review as I'm signing off for tonight, but overall really solid. I appreciate the comments and validation a lot. My only concerns so far are some superficial naming stuff.

help_text=_("The internal database ID for this course. Should not be exposed to users nor in APIs."),
editable=False,
)
course_id = CourseKeyField(
Member

Suggested change
course_id = CourseKeyField(
course_key = CourseKeyField(

Could we call this course_key? *_id in Django is usually a foreign key, and I think it's unfortunate we've used course_id in so many places. I'd love to standardize on *_key for anything that's an OpaqueKey instance.

Contributor Author

Yeah, I should have done that in the first place. Done: 88aef75

Comment on lines +225 to +232
# Enforce that the course ID must end with "+run" where "run" is an exact match for the "run" field.
# This check may be removed or changed in the future if our course ID format ever changes
models.CheckConstraint(
# Note: EndsWith() on SQLite is always case-insensitive, so we code the constraint like this:
condition=Exact(Right("course_id", Length("run") + 1), Concat(models.Value("+"), "run")),
name="oex_catalog_courserun_courseid_run_match_exactly",
violation_error_message=_("The CourseRun 'run' field should match the run in the course_id key."),
),
Member

I have a sense that we'll eventually want to relax this in order to allow sites more flexibility on how they market their course runs, but I agree with adding the constraint for now and seeing how it plays out.

Contributor Author

Turns out that I already had to relax it a bit, because it was breaking on CCX keys like ccx-v1:org+code+run+ccx@1 which don't end with +run. In fact, I realized CCX keys break a few different assumptions here - they also violate the constraint that (org, code, run) is unique per course run, because all CCX variants of a run have the same base course ID.
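
To make the CCX problem concrete, here is a plain-Python mirror of the original check constraint shown above (`Right(course_id, Length(run) + 1)` compared exactly with `"+" + run`). The function name `run_matches_key` is hypothetical, and the sketch works on plain strings rather than on the model fields.

```python
def run_matches_key(course_id: str, run: str) -> bool:
    """Return True if course_id ends with "+<run>", compared case-sensitively."""
    suffix_len = len(run) + 1  # the run itself plus the leading "+"
    return course_id[-suffix_len:] == "+" + run


plain = run_matches_key("course-v1:org+code+run", "run")
wrong_case = run_matches_key("course-v1:org+code+RUN", "run")
ccx = run_matches_key("ccx-v1:org+code+run+ccx@1", "run")
```

A regular key passes, a key whose run differs only in capitalization fails, and the CCX key fails because its last four characters are `cx@1` rather than `+run`.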

Member

@bradenmacdonald I'm rusty on the CCX data model. Is every CCX considered a "course run" by the existing system? I know a CCX is a LearningContext, but does every CCX have a CourseOverview row?

Contributor Author

Yes, every CCX instance is seen as a fully separate course run by most parts of the system (except Studio, which doesn't list them nor interact with them at all; CCX runs are created/edited/managed strictly through the LMS).

I just tested this now; here are the CourseOverviews of a CCX base course plus a CCX run created from it:
Screenshot 2026-03-02 at 9 53 33 AM

@kdmccormick kdmccormick self-requested a review February 26, 2026 23:57
Member

@kdmccormick kdmccormick left a comment

(still reviewing)

return # It's a brand new Organization; we don't care

prev_org_code = Organization.objects.get(pk=instance.pk).short_name
new_org_code = instance.short_name
Member

The logic below looks solid, but I'm curious what it handles that this simpler logic wouldn't handle? ⬇️

if prev_org_code.lower() != new_org_code.lower():
    # If there are any runs, then changing the org code (other than capitalization) is forbidden.
    if CourseRun.objects.filter(catalog_course__org=instance).exists():
        raise ValidationError(...)

Is it so that if a CourseRun's course_key changes, then the org table can be updated to match? If so, do you mind dropping that in the comments?

Contributor Author

@bradenmacdonald bradenmacdonald Feb 27, 2026

I think the only such situation is this:

  • An Organization "MITx" is renamed to "MIT" -> meanwhile all the associated course runs have keys like course-v1:MIT+foo+bar. Don't throw an error, because this is now "more correct".

This situation shouldn't really be possible if you are using these models correctly, but the "org.short_name matches course_key.org" rule is not actually enforced by the database, because we can't write constraints across table boundaries, and it involves parsing an opaque key. So it is possible if you are using the .update() manager API or raw/custom SQL, or anything else that bypasses the checks in course_run.clean()

The other advantage is that it states an exact, example course key in the ValidationError.

That said, I don't think the situation described above is likely to occur so I'd be fine with simplifying this to your suggestion.
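
For readers following the thread, the difference between the two approaches can be sketched without Django. This is not the code from the PR: `check_org_rename`, `org_from_course_key`, and `OrgRenameError` are hypothetical stand-ins, and the key parsing is a naive substitute for opaque-keys. The sketch assumes the rule is "allow a rename if every existing run's key already uses the new org code (ignoring capitalization), otherwise fail and name an example key".

```python
class OrgRenameError(ValueError):
    """Stand-in for Django's ValidationError, to keep the sketch dependency-free."""


def org_from_course_key(course_key: str) -> str:
    """Naive parser for keys like 'course-v1:MIT+foo+bar'; real code would use opaque-keys."""
    return course_key.split(":", 1)[1].split("+")[0]


def check_org_rename(prev_org_code: str, new_org_code: str, run_keys: list[str]) -> None:
    """Allow capitalization changes and renames that agree with every run's key."""
    if prev_org_code.lower() == new_org_code.lower():
        return  # Only the capitalization changed (or nothing did).
    for key in run_keys:
        if org_from_course_key(key).lower() != new_org_code.lower():
            # Name an exact, example course key in the error message.
            raise OrgRenameError(
                f"Cannot rename org to {new_org_code!r}: course run {key} uses a different org code."
            )


# Renaming "MITx" to "MIT" is accepted when the runs already use "MIT" in their keys.
check_org_rename("MITx", "MIT", ["course-v1:MIT+foo+bar"])

# The same rename is rejected when a run's key still says "MITx".
try:
    check_org_rename("MITx", "MIT", ["course-v1:MITx+foo+bar"])
    rejected_message = ""
except OrgRenameError as exc:
    rejected_message = str(exc)
```

The simpler suggestion would reject both renames as soon as any run exists; the per-run comparison is what lets the first one through.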

Member

Thinking about it more, I think I like what you have here. Generally speaking, it seems good to allow data changes towards correctness, rather than forbidding changes entirely. I can think of times where I've been burned by validation whose intent was to keep my data correct, but which also effectively locked incorrect data into remaining incorrect. I'd say just drop a quick comment in to explain that, and keep it as-is.

Contributor Author

Done: a27856f

@kdmccormick kdmccormick self-requested a review February 27, 2026 22:29
Member

@kdmccormick kdmccormick left a comment

Just a few requests for more words, looks great otherwise. I'll take a look at the platform PR soon.

return new_run


def delete_course_run(course_key: CourseKey) -> None:
Member

Could you add more to this docstring? What is and isn't deleted by this function? And will that change when CourseRun becomes authoritative?

Contributor Author

Sure: a27856f

Let me know if you have any thoughts on that, as I wasn't totally sure how it should behave or how we see it evolving.

# Note: display_name should never be blank. But we previously didn't store a name for catalog courses in the core.
# For backfilling, if there is only one run, we use that run's name as the catalog course name. Otherwise, we can
# use the org + course code as the display name.
display_name = case_insensitive_char_field(
Member

@kdmccormick kdmccormick Mar 2, 2026

Doesn't openedx_content use title instead of display_name? Would it make sense to use title here and in CourseRun too?

Contributor Author

If that's what we're moving toward, sure. I don't have any preference.

I'm wondering if we need a little guide of term preferences somewhere. vertical -> unit, sequential -> subsection, chapter -> section, course_id -> course_key, number -> course_code, display_name -> title, etc. There are a lot.

Contributor Author

I should also mention that OLX uses display_name at every level, so we probably can't really change that completely. I do think title is better though.
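
Whatever the field ends up being called, the backfill rule from the code comment on this field (use the single run's name if there is exactly one run, otherwise fall back to org + course code) can be sketched as follows. The function name `backfill_display_name` is hypothetical, and both the space-separated fallback format and the guard against a blank run name are assumptions; the PR may format the fallback differently.

```python
def backfill_display_name(org_code: str, course_code: str, run_names: list[str]) -> str:
    """Pick a catalog course name from its runs, falling back to org + course code."""
    if len(run_names) == 1 and run_names[0].strip():
        # Exactly one run: reuse that run's name for the catalog course.
        return run_names[0]
    # Zero or several runs (or a blank name): assumed "ORG CODE" fallback format.
    return f"{org_code} {course_code}"


single = backfill_display_name("MITx", "6.002x", ["Circuits and Electronics"])
several = backfill_display_name("MITx", "6.002x", ["Circuits (Spring)", "Circuits (Fall)"])
```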

bradenmacdonald and others added 2 commits March 2, 2026 14:18
Co-authored-by: Kyle McCormick <kyle@axim.org>